The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
translated by 谷歌翻译
The aim of this work is to introduce MaRF, a novel framework able to synthesize the Martian environment using several collections of images from rover cameras. The idea is to generate a 3D scene of Mars' surface to address key challenges in planetary surface exploration such as: planetary geology, simulated navigation and shape analysis. Although there exist different methods to enable a 3D reconstruction of Mars' surface, they rely on classical computer graphics techniques that incur high amounts of computational resources during the reconstruction process, and have limitations with generalizing reconstructions to unseen scenes and adapting to new images coming from rover cameras. The proposed framework solves the aforementioned limitations by exploiting Neural Radiance Fields (NeRFs), a method that synthesize complex scenes by optimizing a continuous volumetric scene function using a sparse set of images. To speed up the learning process, we replaced the sparse set of rover images with their neural graphics primitives (NGPs), a set of vectors of fixed length that are learned to preserve the information of the original images in a significantly smaller size. In the experimental section, we demonstrate the environments created from actual Mars datasets captured by Curiosity rover, Perseverance rover and Ingenuity helicopter, all of which are available on the Planetary Data System (PDS).
translated by 谷歌翻译
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
translated by 谷歌翻译
We present SLATE, a sequence labeling approach for extracting tasks from free-form content such as digitally handwritten (or "inked") notes on a virtual whiteboard. Our approach allows us to create a single, low-latency model to simultaneously perform sentence segmentation and classification of these sentences into task/non-task sentences. SLATE greatly outperforms a baseline two-model (sentence segmentation followed by classification model) approach, achieving a task F1 score of 84.4\%, a sentence segmentation (boundary similarity) score of 88.4% and three times lower latency compared to the baseline. Furthermore, we provide insights into tackling challenges of performing NLP on the inking domain. We release both our code and dataset for this novel task.
translated by 谷歌翻译
我们介绍了MLPERF小型推理基准(FPGA)平台上MLPERF微小的推理基准的最新结果。我们使用开源HLS4ML和Finn工作流,旨在使FPGA中优化神经网络的AI硬件代码民主化。我们介绍关键字发现,异常检测和图像分类基准任务的设计和实现过程。最终的硬件实现是针对速度和效率量身定制的,可配置的,可配置的空间数据流体系结构,并引入了新的通用优化和作为本工作的一部分开发的常见工作流程。完整的工作流程从量化感知培训到FPGA实施。该解决方案部署在芯片(PYNQ-Z2)和纯FPGA(ARTY A7-100T)平台上。由此产生的提交的潜伏期低至20 $ \ mu $ s和每次推论的低至30 $ \ mu $ j的能耗。我们展示了异质硬件平台上新兴的ML基准如何催化协作和开发新技术和更容易访问的工具。
translated by 谷歌翻译
与当前的AI或机器人系统相比,人类轻松地在环境中导航,使数据收集诸如Trivial之类的任务。但是,人类发现很难模拟隐藏在数据中的复杂关系。 AI系统,尤其是深度学习(DL)算法,令人印象深刻地捕获了这些复杂的关系。共生耦合人类和计算机的优势可以同时最大程度地减少所需的数据并构建复杂的输入到输出映射模型。本文通过展示新型的人机相互作用框架来使用最小的数据执行故障诊断来实现这种耦合。收集用于诊断复杂系统故障的数据是困难且耗时的。最小化所需数据将增加数据驱动模型在诊断故障时的实用性。该框架为人类用户提供指令,以收集数据,以减轻用于训练和测试故障诊断模型的数据之间的差异。该框架由三个组成部分组成:(1)用于开发培训数据集的数据收集的增强学习算法,(2)用于诊断故障的深度学习算法,以及(3)手持式数据收集数据收集的手持式现实应用程序,用于测试数据。所提出的框架提供了超过100 \%的精度,并在一个新的数据集上召回了每个故障条件的一个实例。此外,还进行了一项可用性研究,以评估手持式增强现实应用程序的用户体验,并且所有用户都能够遵循提供的步骤。
translated by 谷歌翻译
预后有助于实地系统或产品的寿命。量化该系统的当前健康状况使预后能够增强操作员的决策以保护系统的健康状况。由于(a)未知的身体关系和/(b)数据中的不规则性远远超出了问题的启动,因此很难为系统创建预后。传统上,三种不同的建模范例已被用来开发预后模型:基于物理学(PBM),数据驱动(DDM)和混合模型。最近,结合了基于PBM和DDM的方法并减轻其局限性的混合建模方法在预后域中获得了吸引力。在本文中,概述了基于模糊逻辑和生成对抗网络(GAN)的概念的组合概念的一种新型混合建模方法。基于Fuzzygan的方法将基于物理的模型嵌入模糊含义的聚集中。该技术将学习方法的输出限制为现实解决方案。轴承问题的结果表明,在模糊逻辑模型中添加基于物理的聚集的功效,以提高GAN对健康建模的能力并提供更准确的系统预后。
translated by 谷歌翻译
Here, we demonstrate how machine learning enables the prediction of comonomers reactivity ratios based on the molecular structure of monomers. We combined multi-task learning, multi-inputs, and Graph Attention Network to build a model capable of predicting reactivity ratios based on the monomers chemical structures.
translated by 谷歌翻译
Modern deep neural networks have achieved superhuman performance in tasks from image classification to game play. Surprisingly, these various complex systems with massive amounts of parameters exhibit the same remarkable structural properties in their last-layer features and classifiers across canonical datasets. This phenomenon is known as "Neural Collapse," and it was discovered empirically by Papyan et al. \cite{Papyan20}. Recent papers have theoretically shown the global solutions to the training network problem under a simplified "unconstrained feature model" exhibiting this phenomenon. We take a step further and prove the Neural Collapse occurrence for deep linear network for the popular mean squared error (MSE) and cross entropy (CE) loss. Furthermore, we extend our research to imbalanced data for MSE loss and present the first geometric analysis for Neural Collapse under this setting.
translated by 谷歌翻译
Machine Reading Comprehension has become one of the most advanced and popular research topics in the fields of Natural Language Processing in recent years. The classification of answerability questions is a relatively significant sub-task in machine reading comprehension; however, there haven't been many studies. Retro-Reader is one of the studies that has solved this problem effectively. However, the encoders of most traditional machine reading comprehension models in general and Retro-Reader, in particular, have not been able to exploit the contextual semantic information of the context completely. Inspired by SemBERT, we use semantic role labels from the SRL task to add semantics to pre-trained language models such as mBERT, XLM-R, PhoBERT. This experiment was conducted to compare the influence of semantics on the classification of answerability for the Vietnamese machine reading comprehension. Additionally, we hope this experiment will enhance the encoder for the Retro-Reader model's Sketchy Reading Module. The improved Retro-Reader model's encoder with semantics was first applied to the Vietnamese Machine Reading Comprehension task and obtained positive results.
translated by 谷歌翻译